AITopics | unstructured dataset

unstructured dataset

525bd8aedafa375564f73bacdef411e5-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems, Feb-13-2026, 07:23:28 GMT

Mutual Information (MI) is a fundamental metric for quantifying dependency between two random variables.

artificial intelligence, dataset, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)


525bd8aedafa375564f73bacdef411e5-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems, Oct-10-2025, 02:36:41 GMT

dataset, estimator, mi estimator, (14 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Information Management (0.69)
Information Technology > Data Science (0.68)


Multimodal Structure-Aware Quantum Data Processing

Hawashin, Hala, Sadrzadeh, Mehrnoosh

arXiv.org Artificial Intelligence, Jan-12-2025

While large language models (LLMs) have advanced the field of natural language processing (NLP), their "black box" nature obscures their decision-making processes. To address this, researchers developed structured approaches using higher-order tensors. These can model linguistic relations, but training them on classical computers stalls because of their excessive size. Tensors are natural inhabitants of quantum systems, and training on quantum computers provides a solution by translating text to variational quantum circuits. In this paper, we develop MultiQ-NLP: a framework for structure-aware data processing with multimodal text+image data. Here, "structure" refers to syntactic and grammatical relationships in language, as well as the hierarchical organization of visual elements in images. We enrich the translation with new types and type homomorphisms and develop novel architectures to represent structure. When tested on a mainstream image classification task (SVO Probes), our best model performed on par with state-of-the-art classical models; moreover, the best model was fully structured.

artificial intelligence, dataset, natural language, (15 more...)

arXiv.org Artificial Intelligence

2411.04242

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Information Technology > Software (0.61)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)


Multimodal Quantum Natural Language Processing: A Novel Framework for using Quantum Methods to Analyse Real Data

Hawashin, Hala

arXiv.org Artificial Intelligence, Oct-29-2024

Despite significant advances in quantum computing across various domains, research on applying quantum approaches to language compositionality, such as modeling linguistic structures and interactions, remains limited. This gap extends to the integration of quantum language data with real-world data from sources like images, video, and audio. This thesis explores how quantum computational methods can enhance the compositional modeling of language through multimodal data integration. Specifically, it advances Multimodal Quantum Natural Language Processing (MQNLP) by applying the Lambeq toolkit to conduct a comparative analysis of four compositional models and evaluate their influence on image-text classification tasks. Results indicate that syntax-based models, particularly DisCoCat and TreeReader, excel at capturing grammatical structures, while bag-of-words and sequential models struggle due to limited syntactic awareness. These findings underscore the potential of quantum methods to enhance language modeling and drive breakthroughs as quantum technology evolves.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.05023

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.93)
Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)


A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets

Lee, Kyungeun, Rhee, Wonjong

arXiv.org Machine Learning, Oct-14-2024

Mutual Information (MI) is a fundamental metric for quantifying dependency between two random variables. When we can access only the samples, but not the underlying distribution functions, we can evaluate MI using sample-based estimators. Assessment of such MI estimators, however, has almost always relied on analytical datasets including Gaussian multivariates. Such datasets allow analytical calculations of the true MI values, but they are limited in that they do not reflect the complexities of real-world datasets. This study introduces a comprehensive benchmark suite for evaluating neural MI estimators on unstructured datasets, specifically focusing on images and texts. By leveraging same-class sampling for positive pairing and introducing a binary symmetric channel trick, we show that we can accurately manipulate true MI values of real-world datasets. Using the benchmark suite, we investigate seven challenging scenarios, shedding light on the reliability of neural MI estimators for unstructured datasets.
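As a rough illustration of why a binary symmetric channel (BSC) gives a controllable ground-truth MI, the sketch below uses the textbook result that, for a uniform input bit X and an output Y that flips X with probability p, I(X;Y) = 1 - H_b(p) bits. The abstract does not spell out the paper's actual construction on images and texts, so this is not the authors' method; the function names (`binary_entropy`, `bsc_mutual_information`, `sample_bsc`, `plugin_mi`) are hypothetical and chosen only for this example.

```python
import math
import random
from collections import Counter


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_mutual_information(flip_prob: float) -> float:
    """True MI in bits between a uniform bit X and Y = X XOR noise."""
    return 1.0 - binary_entropy(flip_prob)


def sample_bsc(n: int, flip_prob: float, seed: int = 0) -> list:
    """Draw n (x, y) pairs: x is a uniform bit, y flips x with prob flip_prob."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.randint(0, 1)
        y = x ^ (1 if rng.random() < flip_prob else 0)
        pairs.append((x, y))
    return pairs


def plugin_mi(pairs: list) -> float:
    """Plug-in (empirical frequency) MI estimate in bits for discrete pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


if __name__ == "__main__":
    for p in (0.0, 0.1, 0.3, 0.5):
        true_mi = bsc_mutual_information(p)
        est_mi = plugin_mi(sample_bsc(200_000, p, seed=1))
        print(f"flip_prob={p:.1f}  true MI={true_mi:.4f}  plug-in estimate={est_mi:.4f}")
```

Sweeping `flip_prob` from 0 to 0.5 moves the true MI from 1 bit down to 0 bits, which is the sense in which a known channel lets an estimator be scored against a known target. The plug-in estimator here only works for low-dimensional discrete data; it stands in for the neural estimators the benchmark actually evaluates.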

dataset, estimator, information, (14 more...)

arXiv.org Machine Learning

2410.10924

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Information Management (0.69)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Topic Segmentation in the Wild: Towards Segmentation of Semi-structured & Unstructured Chats

Ghosh, Reshmi, Kajal, Harjeet Singh, Kamath, Sharanya, Shrivastava, Dhuri, Basu, Samyadeep, Srinivasan, Soundararajan

arXiv.org Artificial Intelligence, Nov-27-2022

Breaking down a document or a conversation into multiple contiguous segments based on its semantic structure is an important and challenging problem in NLP, which can assist many downstream tasks. However, current work on topic segmentation often focuses on structured texts. In this paper, we comprehensively analyze the generalization capabilities of state-of-the-art topic segmentation models on unstructured texts. We find that: (a) Current strategies of pre-training on a large corpus of structured text such as Wiki-727K do not improve transferability to unstructured texts.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2211.14954

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > Maryland (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)


Accelerate Internship Opportunities

#artificialintelligence, May-1-2020, 01:58:14 GMT

Analytica Advisors is a boutique consulting firm engaged in building global capital markets for sustainability leaders among both long-term investors and companies. It focuses on financial and technical innovation for its clients the world over. The internship involves developing natural language processing and machine learning tools to analyze structured and unstructured datasets in high- and low-carbon sectors using company annual reports. COVID-19 has accelerated shifts in the global energy market such that energy companies and banks may be impacted in the medium to long term. For investors and policy makers, it is important to understand the scope and scale of potential change and consolidation from the point of view of individual companies.

accelerate internship opportunity, ammonia and hydrogen, unstructured dataset, (7 more...)

#artificialintelligence

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.81)


Reconstructing Self Organizing Maps as Spider Graphs for better visual interpretation of large unstructured datasets

Prakash, Aaditya

arXiv.org Machine Learning, Dec-24-2012

Self-Organizing Maps (SOMs) are a popular type of unsupervised artificial neural network used to reduce dimensions and visualize data. Visual interpretation of SOMs has been limited by the grid approach to data representation, which makes inter-scenario analysis impossible. The paper proposes a new way to structure SOMs. This model reconstructs a SOM to show the strength between variables as the threads of a cobweb and to illuminate inter-scenario analysis. While radar graphs are a very crude representation of a spider web, this model uses a more lively and realistic cobweb representation to take into account the differences in strength and length of threads. This model allows for visualization of highly unstructured datasets with a large number of dimensions, common in big data sources.

artificial intelligence, dataset, machine learning, (16 more...)

arXiv.org Machine Learning

1301.0289

Genre: Research Report (0.82)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
